ABSTRACT

COCA, which consists of both authoring tools and a 
runtime shell, is a system intended to provide teachers with genuine 
access to intelligent tutoring system (ITS) technology and to give 
them control over domain material and teaching strategies. To 
evaluate the effectiveness of COCA, 10 subjects (five university 
teachers and five school teachers) were given an authoring task and 
asked to complete a questionnaire. The subjects were presented with 
domain material (the American Revolution), an initial teaching 
strategy and a meta-teaching strategy in the form of prepared 
knowledge bases, and a set of instructions for altering the teaching 
behavior of the final system and extending the domain material. The 
questionnaire covered the general use of COCA, the suitability of the 
teaching strategy representation, the domain representation, and how 
subjects' opinions of AI (artificial intelligence) in general and 
COCA in particular had changed as a result of the task. Results 
indicate that school teachers' attitudes toward AI improved and 
university teachers' attitudes remained positive; COCA was found to 
be successful for simple tutoring systems, yet too difficult to use. 
(Contains 17 references.) (aEF) 
Abstract: This paper discusses an evaluation of COCA, a system that gives
teachers control over domain material, teaching strategy and meta-teaching
strategy. The purpose of this evaluation is to study more fully the effectiveness
of the system. Ten subjects were given an authoring task. The resulting
knowledge bases, together with a questionnaire, made up the experimental
results. This study shows the strengths and weaknesses of the COCA approach
and whether it has helped improve teachers' attitudes towards AI. The results
show that the system was successful, yet too complex. Results are tentative
due to the size of the experiment.
1 Introduction



[...] of different teaching styles in the classroom. Examples include the Plowden Report (1967),
looking at teaching methods used in primary schools, studies of formal and informal [...] styles of
computer-based training [...] suggests that many teachers have rejected educational software as
it uses the wrong teaching style and is not available in the right subject area. Only recently have
intelligent tutoring systems (ITSs) really been able to address these problems, as systems have begun
to emerge [...] ([...] Elsom-Cook, Byerley, Brooks, Federici, & Scaroni, 1990) [...] change those styles.
Consequently, there is little or no actual evaluation [...] classroom which support the teacher with a
range of teaching strategies, and few or no [...] of evaluation has been highlighted ([...] 1993), with
an increasing need for researchers to use appropriate methods for evaluating their results.

COCA, which consists of both authoring tools and a runtime shell, is a system intended to provide
teachers with genuine access to ITS technology. Teachers are either uninformed about using AI techniques
in schools, or do not have the resources to use them, or more seriously, do not trust the decision making of
a system which would be controlling some of the teaching in their classroom. As the result of a number of
[...] requirements from teachers about what they need from an intelligent assistant. Predominant amongst
these was the ability to control the teaching [...]. On the basis of the use of meta-level
reasoning in the fields of planning and knowledge-based systems research, a prototype system, COCA-0,
was built. This was then informally evaluated with teachers, resulting in a more complete system, COCA,
described in Major & Reichgelt (1992), which gives a great deal of control over the teaching style of
the system to the teachers themselves. The initial evaluation of COCA was the reconstruction of the [...]

[...] a more formal evaluation of COCA-1, comparing school teachers and university teachers [...] AI in
the classroom before and after the experiment. The subjects [...] were asked to perform a given authoring
task and to fill in a questionnaire. Both the knowledge bases produced during the task, and the completed
questionnaires [...] of the experiment [...] to evaluate a more detailed use of COCA's authoring tools,
and to discover if attitudes towards using AI in the classroom had been improved as a result of using COCA.

The university teachers were familiar with ITSs, although none of the subjects was experienced in using
COCA. Some of the school teachers had used computers before and some had no computer experience.



2 Background 

[...] of authoring tools enabling teachers to build ITSs directly [...] knowledge and tutoring strategy
knowledge to be created and edited. KAFITS facilitates teacher involvement in the ITS design process,
alongside the knowledge engineer. COCA, whose aim has been to allow the teacher to work alone with the
system, is necessarily a less complex system.
KAFITS has been evaluated in a case study with one high school teacher and two education graduates 
(Murray, 1993). This was very much a qualitative evaluation but nonetheless gave some encouraging
results in terms of productivity, with about 85 hours of effort needed for an hour of instruction.

Although there have not been any other evaluations of ITS construction tools, a certain amount of work
has been done on suitable evaluation for ITSs, which naturally concentrates more on the use of the system
by students rather than teachers. Different evaluation techniques have been studied (Mark & Greer, 1993)
in order to point towards evaluation methodologies for ITSs. This work is relevant to COCA as it discusses
the evaluation of different aspects of an ITS architecture, and most notably, the teaching strategy. It 
suggests changing the teaching strategy while keeping other components fixed as a possible means of 
evaluating effectiveness. Of all the techniques discussed, formal experimental techniques are put forward 
as suitable for summative evaluation of different components of the system. 

Educational impact (Littman & Soloway, 1988) is a major test of an ITS in terms of both achievement
and effect, expressed by how well a student learns and what attitudes are retained after that learning.
Although we are not evaluating the ITSs built with COCA, we can still use these criteria with teachers
to see whether they can use COCA's tools successfully, and also what effect this has on their attitude
towards using AI tools in the classroom. These ideas are reflected in the aims of this evaluation of COCA.

2.1 General evaluation methods 

A general discussion of different methods for evaluating software usability is given in Macleod (1992). 
One of the quickest methods of evaluation would be to give the system to teachers, asking them to use 
it as much as possible, and to write a report. This helps find important problems with the system, but 
is very dependent on the teacher. It is also similar to the case study reported in Major (1993a) and so 
would be of less value to us here. 

Analytic evaluation methods are used early in the design process. They may employ a model of the user 
and apply it to a specification of the system. This model basically predicts the different interactions likely 
to be performed by the user. The advantage is very cost effective evaluation, done before implementation. 

Controlled experimental studies of software usability are difficult because there are so many dependent 
variables when it comes to using a computer, such as the users themselves and their attitudes, the tasks 
they perform and the different environments in which they work. The standard procedure is to separate 
groups of subjects and to get them to perform the same experiment with one crucial variable changed. 
The results of the different groups are used to prove or disprove hypotheses about that variable. But 
isolating one such feature in COCA's context would be a very difficult exercise. An assessment of the 
value of controlled experiments with regard to evaluating software is given in Monk (1985).

Survey methods (Chin, Diehl & Norman, 1988) are the cheapest form of obtaining evaluation information.
They are usually done by questionnaire. Quantitative data can be obtained if more than ten
subjects are used, as well as qualitative data. Such a method would certainly be applicable with COCA,
as long as the questions are directed towards the aims of the study.

A final set of methods are known as observational methods. This involves watching users working 
with the system, either directly or by the use of video, and recording the process. This information can 
then be analysed. This is more costly than using a questionnaire, but allows better access to the task 
itself. One such technique incorporates a think-aloud protocol (Wright, Monk & Carey, 1991). 

The methods most applicable to COCA are survey and observational methods. The survey can be
used by asking direct questions about COCA's usability and sufficiency, and about attitudes towards rules
for representing teaching strategies. COCA allows users to save their knowledge bases and these can be
analysed to allow observational evaluation. Further, users could record their interaction with COCA by
noting all the decisions they make and the functions they use when performing a task.

3 Experimental aims 

The primary aim of the experiment was to assess whether or not COCA will be useful to teachers and 
authors. A number of issues were considered which are numbered below: 

1 A first issue was the teacher's attitude towards AI techniques as opposed to traditional
computer-assisted learning. The experiment investigated the teachers' perceptions of the changes in their
attitude. It should be noted that this is a subjective measure. Although it could be suggested that
the teachers' perception of their attitude change will always increase with an environment change, we
observed in Major (1993a) that a system not offering control over teaching decisions would have been
a serious disincentive to teachers with respect to using AI in the classroom.

2 Another question of particular interest was whether teachers considered they had had a conscious
view of their strategies before using COCA. This is also a subjective measure.

3 Further, the experiment set out to discover whether teachers felt that the heuristics provided by
COCA were likely to be sufficient to control teaching behaviour.






4 At a more practical level, the experiment set out to see which of COCA's facilities [...]
the teachers, were easy to use, and which were little used and difficult. Further, [...]
which areas of authoring support subjects felt were likely to have to be extended.
This is important when considering how future research can best build on COCA.

4 Evaluation method 

[...] the American Revolution, which together with its [...] strategy formed part of a conference
demonstration of COCA (Major and [...]). [...] largely declarative in nature, makes use of a broad [...]

[...] efficiency by comparing the number of rules [...]. As many questions as possible were given a
closed [...]

5 Results 

5.1 Knowledge bases produced 

The knowledge bases produced were tested and analysed to see how much of the task [...] domain
authoring tools was not satisfactory and shows how to improve future versions of COCA.

The first four subjects were all school teachers. Subject 2 in particular seems to have [...]
training with the system before being able to make any [...] progress with [...]. [...] with the least
computer experience did least well, which is an expected result. The [...] subject varied between
2 and 3 hours. Notes taken by the subjects [...]
Task Description                          Subjects
                              1     2     3     4     5     6     7    Mean    SD

Domain
  Change name                 Y     Y     Y     Y     Y     Y     Y
  Add objects                 Y     Y     Y     Y     Y     Y     Y
  Place objects               Y     Y     Y     Y     Y     Y     Y
  User attributes             N     N     N     Y     Y     Y     N
  Subtotal (%)               75    75    75   100   100   100    75      86    13

Strategy
  Change name                 Y     Y     Y     Y     Y     Y     Y
  Stay after fail             Y     N     Y     Y     Y     Y     N
  Fill in blank test          Y     N     Y     Y     Y     Y     Y
  Categorise test             Y     Y     Y     Y     Y     Y     Y
  Test on dates               Y     N     Y     Y     Y     N     Y
  Move summary                N     N     N     N     N     N     N
  Swap date/causes            Y     N     Y     N     Y     N     N
  Teach user attributes       N     N     N     N     N     N     N
  Subtotal (%)               75    25    75    63    75    50    50      59    19

Meta-strategy
  Change name                 Y     Y     Y     Y     Y     Y     Y
  Remove implications         Y     N     N     Y     Y     Y     N
  Restore implications        Y     N     N     Y     N     Y     N
  Subtotal (%)              100    33    33   100    67   100    33      67    34

Total (%)                    80    40    67    80    80    73    53      68    16
Score (max 15)               12     6    10    12    12    11     8    10.1   2.3

Table 1: Percentages of tasks achieved by each subject
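
The subtotals, totals and scores in Table 1 follow directly from the Y/N entries. The following is a minimal sketch of that arithmetic, not part of the original study: the dictionary layout and the helper names (`round_half_up`, `counts`, `percentages`) are assumptions made for illustration, and the Y/N strings are transcribed from Table 1 with one character per subject. The use of a sample standard deviation is also an assumption, chosen because it reproduces the reported 2.3 for the scores.

```python
import math
from statistics import mean, stdev

# Task outcomes transcribed from Table 1 (Y = achieved, N = not achieved),
# one character per subject, subjects 1 to 7 from left to right.
TABLE_1 = {
    "Domain": {
        "Change name": "YYYYYYY",
        "Add objects": "YYYYYYY",
        "Place objects": "YYYYYYY",
        "User attributes": "NNNYYYN",
    },
    "Strategy": {
        "Change name": "YYYYYYY",
        "Stay after fail": "YNYYYYN",
        "Fill in blank test": "YNYYYYY",
        "Categorise test": "YYYYYYY",
        "Test on dates": "YNYYYNY",
        "Move summary": "NNNNNNN",
        "Swap date/causes": "YNYNYNN",
        "Teach user attributes": "NNNNNNN",
    },
    "Meta-strategy": {
        "Change name": "YYYYYYY",
        "Remove implications": "YNNYYYN",
        "Restore implications": "YNNYNYN",
    },
}
N_SUBJECTS = 7


def round_half_up(x):
    """Round to the nearest integer, with halves rounded up (62.5 -> 63)."""
    return math.floor(x + 0.5)


def counts(tasks):
    """Number of achieved tasks per subject within one group of tasks."""
    return [sum(row[i] == "Y" for row in tasks.values()) for i in range(N_SUBJECTS)]


def percentages(tasks):
    """Percentage of achieved tasks per subject within one group of tasks."""
    return [round_half_up(100 * c / len(tasks)) for c in counts(tasks)]


# Subtotal (%) rows, one list per task group.
subtotals = {name: percentages(tasks) for name, tasks in TABLE_1.items()}

# Score (max 15) row: achieved tasks summed over all three groups.
scores = [sum(col) for col in zip(*(counts(t) for t in TABLE_1.values()))]

# Total (%) row: score as a percentage of the 15 tasks.
totals = [round_half_up(100 * s / 15) for s in scores]

print(subtotals)
print(scores, round(mean(scores), 1), round(stdev(scores), 1))
print(totals)
```

Halves are rounded up explicitly because Python's built-in `round` rounds 62.5 to 62, whereas the table reports 63 for subject 4's strategy subtotal.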



The efficiency for each subject is then calculated as the number of completed tasks per change to the
knowledge base made, relative to this number for an optimal solution.





                                            Subjects
                         Optimal     1     2     3     4     5     6     7

Domain changes                 4     3     3     3     4     4     4     3
Strategy changes              13     7     5    10     9    11     8     6
Meta-strategy changes          3     3     1     1     3     3     3     1

Total changes                 20    13     9    14    16    18    15    10
Score                         15    12     6    10    12    12    11     8
Efficiency (%)               100   123    89    95   100    89    98   107

Table 2: Relative efficiency for each subject



We can see that, although we have only an approximation to the efficiency of the solutions, they
are all very close to the optimal solution. The reason that it is possible to be more efficient than the
optimal solution is that the more difficult tasks, which are completed in the optimal solution, require
more changes to the knowledge bases than the majority of the tasks performed by the subjects. So a
subject who has not attempted the more difficult tasks will have made fewer changes in proportion to his
score than the optimal solution, and thus will have a higher efficiency. Subject 2 scored relatively highly
despite not achieving many of the tasks. This was because the tasks she did achieve were done efficiently,
and she did not make many attempts at those she could not manage. A lower efficiency also shows that a
subject has taken longer to perform the task in comparison to what was achieved. Generally, we can see
that the subjects managed to make the changes to the knowledge bases without using more rules than
necessary, and so that COCA-1's flexibility is not at a high cost in terms of long editing sessions.
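
The efficiency measure described above (completed tasks per change, relative to the same ratio for the optimal solution) can be checked against the figures in Table 2. This is an illustrative sketch only: the function name `efficiency` and the dictionary layout are assumptions, and the scores and change counts are transcribed from Table 2.

```python
# Score and total number of knowledge base changes, transcribed from Table 2.
OPTIMAL = {"score": 15, "changes": 20}
SUBJECTS = {
    1: {"score": 12, "changes": 13},
    2: {"score": 6, "changes": 9},
    3: {"score": 10, "changes": 14},
    4: {"score": 12, "changes": 16},
    5: {"score": 12, "changes": 18},
    6: {"score": 11, "changes": 15},
    7: {"score": 8, "changes": 10},
}


def efficiency(score, changes, optimal=OPTIMAL):
    """Completed tasks per change, relative to the optimal solution, in percent."""
    return 100 * (score / changes) / (optimal["score"] / optimal["changes"])


efficiencies = {subject: round(efficiency(**row)) for subject, row in SUBJECTS.items()}
print(efficiencies)
```

Subject 1, for example, completed 12 tasks with 13 changes (about 0.92 tasks per change) against 15 tasks with 20 changes (0.75) for the optimal solution, which is how a value above 100% arises.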

5.2 Questionnaire findings 

Some of the questions asked for a mark on a 100mm scale, whereas others asked for more subjective
opinions about teaching strategies and the use of AI in general. Table 3 gives the results. The scores are
given with 0 as the worst possible score and 100 as the best. The average and standard deviation are also
given. The subject numbers correspond to the numbers in tables 1 and 2 and are included for reference.

The first [...] subjects are school teachers and the rest university teachers. Firstly, and most generally,
we can see that the majority of averages are over 50, putting them in the positive half of the scale
as far as COCA is concerned. Question 3.3, regarding the use of variables in rules, shows the largest
standard deviation. This results from the university teachers suggesting that they were strongly in favour
of variables, whereas school teachers felt they would be highly unlikely to use them. On the other hand,
question 3.8, regarding the practicality of building libraries of teaching strategies with COCA for general
use, had the lowest standard deviation and the highest degree of consensus. This shows the potential of
a system like COCA within a school or other teaching environment. Other points to notice are the high
Question   Description                 Average    SD

[...]      Terminology clear              64      24
1.2        Confident usage                46      25
1.3        Getting lost                   56      17
1.5        Broad usage                    65      27
1.6        UI easy                        58      26
1.7        Tools useful                   76      23
2.3        Domain tree clear              72      31
2.7        Course libraries               78      23
2.8        Domain distortion            [...]     35
3.3        Likely to use variables        62      37
3.6        Suits your teaching            68      23
3.7        Decisions easily made          62      33
3.8        Strategy libraries             91       6

(Individual subject scores: [...])

Table 3: Answers to questions using a 100mm scale



scores, particularly amongst the school teachers, to question 3.6, regarding the completeness of COCA's
teaching strategy model with respect to their own teaching style. The only score to be strongly [...]
was that of question 1.3, regarding whether users often felt lost in the system. It is clear that COCA's
user interface needs to be improved to increase users' confidence and stop them getting lost.

The next part of the questionnaire measured changes of attitude towards teaching strategies and AI
in the classroom. We shall concentrate on our school teachers as it is [...] attitudes [...]
more important. The two questions of interest to us are question 1.8, regarding the use of AI as opposed
[...] in the classroom, and question 3.10, which asked about attitudes towards teaching strategies [...]

[...] comments made in the unstructured comments section of the questionnaire.
Without exception all our school teachers felt that COCA had shown them that AI could be useful in
the classroom and had something to offer. They of course mentioned the problems of [...]
terminology, but could see the underlying usefulness of the system. With regard to teaching strategies,
none of the school teachers had had as structured a view of strategies [...] COCA. Four [...]
forcing the user to consider their teaching in such a way was useful. None suggested that this view of
teaching made authoring either difficult or distorted.

A number of other points were made in the responses to the other more open questions. The main
[...] student model [...] representation. The first three of these would be the main effort in any
development of COCA's authoring tools for commercial use. Although the student model is weak it could
easily be extended to allow any teacher-defined attribute (psychological/pedagogical) to be given to a
student and to control the teaching. The domain representation could also be extended, with perhaps the
strategy interpreter [...] domain interpreter more control, thus allowing larger and more interesting
pieces of domain material to be taught. Some other strengths of COCA that subjects mentioned were its
potential for formalising teaching through the strategy model, and thus to become a useful tool for
trainee teachers. [...] thought that being forced to consider teaching decisions beforehand would be
profitable [...] very much with the mixed-ability problem in the classroom. Finally, [...] felt that [...]
the student rating thresholds (i.e. the student modelling behaviour) [...] very useful.

6 Concluding discussion

[...] and in particular the use of a [...]

Although the number of subjects was not large enough to make any categorical [...]
us a basis upon which to examine the different attitudes towards COCA and also [...] differences
between subjects. Those subjects who were university teachers were [...]
the use of computers and were familiar with AI techniques. Despite [...]
The second aim of the experiment concerned the subjects' opinions of their views on teaching strategies.
Those subjects who did not feel that they had had an explicit concept of a strategy were typically
the school teachers. They suggested that the formalising of a process that had previously been implicit
was useful. The university teachers felt they had had a conscious view of strategies before.

With regard to the third aim, that of the sufficiency of COCA's strategy heuristics, the subjects said 
that COCA was indeed sufficient for their requirements. Further, there was strong support that building 
up libraries of strategies would be a practical way of using COCA. Those subjects who already had an 
idea of using strategies, typically university teachers, felt that COCA's model of a strategy and facilities
for building strategies were sufficiently flexible for their needs. Indeed no subjects suggested aspects of 
the teaching process that were not catered for in COCA's tools. 

The final aim was to discover those aspects of the system that were easy or difficult to use, and 
thus which aspects might need to be extended. With regard to this aim the experiment has also shown 
a number of weaknesses with COCA. We have already mentioned the domain representation and student
modelling. Other points included the need for better and more comprehensive documentation and
improvements in the interface. None of the subjects felt the teaching strategy model was a weakness.
However, the fact that a number of tasks were not completed suggests that COCA's authoring tools need
to be made more intuitive.

In summary we can say that COCA is usable enough for simple tutoring systems to be built, suggesting 
that a system like COCA should be pursued further, particularly with regard to the points arising from 
the final aim of the experiment. The school teachers who were suggesting the use of strategy and domain 
knowledge base libraries were highlighting a real problem for a system of COCA's type, namely that power
is required at the strategy level, and yet the authoring task must be very simple if teachers are going
to use a system on an everyday basis. To address this, a new version of COCA is under development,
which very much simplifies the authoring task by hiding all rules from the teacher and giving graphical
controls for strategy construction. This new version, running under Windows 3.1 on a PC, will be used 
as the basis of further experiments to investigate the nature of meta-strategic knowledge. 